Moyal distribution — a Landau-like energy-loss model#

The Moyal distribution (moyal) is a continuous, right-skewed location–scale distribution on the real line. It is best known as a remarkably accurate analytic approximation to the Landau distribution, used in particle physics to model ionization energy loss in thin absorbers.

What you’ll learn#

  • how the moyal PDF and CDF are defined (and why the CDF has a closed form via erfc)

  • closed-form mean/variance/skewness/kurtosis, MGF/CF (domain!), and differential entropy

  • a clean latent-variable representation: \(X = \mu - 2\sigma\log|U|\) with \(U\sim\mathcal{N}(0,1)\)

  • a NumPy-only sampler + Monte Carlo validation

  • practical usage via scipy.stats.moyal (pdf, cdf, rvs, fit)


import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from scipy import optimize, special, stats

# Plotly rendering (CKC convention)
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

# Reproducibility
rng = np.random.default_rng(7)

np.set_printoptions(precision=4, suppress=True)

1) Title & Classification#

Name: moyal (Moyal distribution)
Type: continuous
Support: \(x \in (-\infty, \infty)\)

Parameter space (SciPy-compatible location–scale form)#

We use the parameterization

  • location: \(\mu \in \mathbb{R}\)

  • scale: \(\sigma > 0\)

Notation: \(X\sim\mathrm{Moyal}(\mu,\sigma)\).

The standard Moyal is \(\mathrm{Moyal}(0,1)\).

2) Intuition & Motivation#

What this distribution models#

The Moyal distribution is a convenient analytic model for strongly right-skewed continuous measurements. Its most common story comes from particle physics:

  • a charged particle traversing a thin material loses energy mostly through many small interactions,

  • plus occasional large energy-transfer events.

That mechanism produces a distribution with a sharp peak and a long right tail (rare large losses). The physically motivated Landau distribution is often used for this, and the Moyal distribution provides a closed-form approximation that is easy to compute and fit.

Typical real-world use cases#

  • Ionization energy loss (\(dE/dx\)) in thin absorbers (Landau-like “straggling”).

  • Right-skewed measurement noise where large positive excursions occur (rare, extreme events).

  • As a component in mixture models when you need a peaked density with an exponential right tail.

Relations to other distributions#

A very useful identity connects the Moyal to Gaussian / chi-square variables. For \(Z\sim\mathrm{Moyal}(0,1)\):

  • \(Y = \exp(-Z/2)\) has a half-normal distribution: \(Y\sim |\mathcal{N}(0,1)|\).

  • Equivalently, \(W = \exp(-Z)\) has a chi-square distribution with 1 degree of freedom: \(W\sim\chi^2_1\).

These transformations make the CDF and many moments essentially “free”.
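These identities are easy to sanity-check numerically. The sketch below draws standard Moyal samples with scipy.stats (covered formally in Section 9) and tests \(W=\exp(-Z)\) against \(\chi^2_1\) with a one-sample KS test:

```python
import numpy as np
from scipy import stats

# If Z ~ Moyal(0,1), then W = exp(-Z) should follow chi-square with 1 dof.
rng = np.random.default_rng(0)
z = stats.moyal.rvs(size=50_000, random_state=rng)
w = np.exp(-z)

# One-sample KS test against the exact chi2_1 CDF.
ks = stats.kstest(w, "chi2", args=(1,))
print(ks.statistic, ks.pvalue)  # a large p-value is consistent with chi2_1
```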

3) Formal Definition#

Define the standardized variable

\[Z = \frac{X-\mu}{\sigma}.\]

PDF#

For the standard Moyal (\(\mu=0,\sigma=1\)):

\[ f_Z(z) = \frac{1}{\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(z + e^{-z}\right)\right],\qquad z\in\mathbb{R}. \]

For general \((\mu,\sigma)\) (location–scale transform):

\[ f_X(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma} + \exp\left(-\frac{x-\mu}{\sigma}\right)\right)\right],\qquad x\in\mathbb{R}. \]

CDF#

A closed form exists using the complementary error function \(\operatorname{erfc}\):

\[ F_Z(z) = \operatorname{erfc}\!\left(\frac{e^{-z/2}}{\sqrt{2}}\right), \qquad F_X(x)=F_Z\!\left(\frac{x-\mu}{\sigma}\right). \]

Recall \(\operatorname{erfc}(u)=1-\operatorname{erf}(u)\).

Quantile function (inverse CDF)#

Because the CDF has the form \(\operatorname{erfc}(\text{something exponential})\), the quantile is also explicit:

\[ F_Z^{-1}(p) = -2\log\left(\sqrt{2}\,\operatorname{erfc}^{-1}(p)\right),\qquad p\in(0,1), \]

and \(F_X^{-1}(p)=\mu + \sigma F_Z^{-1}(p)\).

We’ll implement numerically stable logpdf-first computations below.

SQRT_2 = np.sqrt(2.0)
LOG_SQRT_2PI = 0.5 * np.log(2.0 * np.pi)


def _check_scale(sigma: float) -> float:
    sigma = float(sigma)
    if (not np.isfinite(sigma)) or sigma <= 0:
        raise ValueError("sigma must be positive and finite")
    return sigma


def moyal_logpdf(x, mu: float = 0.0, sigma: float = 1.0) -> np.ndarray:
    '''Log-PDF of Moyal(mu, sigma).'''

    x = np.asarray(x, dtype=float)
    mu = float(mu)
    sigma = _check_scale(sigma)

    z = (x - mu) / sigma

    # exp(-z) can overflow for very negative z; clipping avoids spurious inf warnings.
    exp_neg_z = np.exp(np.clip(-z, -745, 709))

    return -np.log(sigma) - LOG_SQRT_2PI - 0.5 * (z + exp_neg_z)


def moyal_pdf(x, mu: float = 0.0, sigma: float = 1.0) -> np.ndarray:
    return np.exp(moyal_logpdf(x, mu, sigma))


def moyal_cdf(x, mu: float = 0.0, sigma: float = 1.0) -> np.ndarray:
    x = np.asarray(x, dtype=float)
    mu = float(mu)
    sigma = _check_scale(sigma)

    z = (x - mu) / sigma
    t = np.exp(np.clip(-0.5 * z, -745, 709)) / SQRT_2
    return special.erfc(t)


def moyal_ppf(p, mu: float = 0.0, sigma: float = 1.0) -> np.ndarray:
    p = np.asarray(p, dtype=float)
    mu = float(mu)
    sigma = _check_scale(sigma)

    if np.any((p <= 0) | (p >= 1)):
        raise ValueError("p must be strictly between 0 and 1")

    z = -2.0 * np.log(SQRT_2 * special.erfcinv(p))
    return mu + sigma * z
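As a quick consistency check of the quantile formula, we can compare it against SciPy's reference implementation (this snippet is self-contained and does not reuse the helpers above):

```python
import numpy as np
from scipy import special, stats

# Closed-form quantile of the standard Moyal vs SciPy's reference ppf.
p = np.linspace(0.01, 0.99, 99)
z = -2.0 * np.log(np.sqrt(2.0) * special.erfcinv(p))
diff = np.max(np.abs(z - stats.moyal.ppf(p)))
print(diff)  # expected to be near machine precision
```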

4) Moments & Properties#

Let \(X\sim\mathrm{Moyal}(\mu,\sigma)\). Let \(\gamma\) be the Euler–Mascheroni constant and \(\zeta\) the Riemann zeta function.

Quantity

Value

Mean

\(\mathbb{E}[X] = \mu + \sigma(\gamma + \ln 2)\)

Variance

\(\mathrm{Var}(X) = \frac{\pi^2}{2}\,\sigma^2\)

Skewness

\(\displaystyle \frac{28\sqrt{2}\,\zeta(3)}{\pi^3}\)

Excess kurtosis

\(4\) (so kurtosis is \(7\))

Mode

\(\mu\)

MGF

\(\displaystyle M_X(t)=e^{\mu t}2^{-\sigma t}\frac{\Gamma(\tfrac12-\sigma t)}{\Gamma(\tfrac12)},\; t<\frac{1}{2\sigma}\)

Characteristic function

\(\displaystyle \varphi_X(t)=e^{i\mu t}2^{-i\sigma t}\frac{\Gamma(\tfrac12-i\sigma t)}{\Gamma(\tfrac12)}\)

Differential entropy

\(\displaystyle h(X)=\ln\sigma + \frac{1+\gamma+\ln(4\pi)}{2}\)

Tail behavior. As \(x\to+\infty\), \(f(x)\) behaves like an exponential density with rate \(1/(2\sigma)\) (long right tail). As \(x\to-\infty\), \(f(x)\) decays super-exponentially because of the \(\exp(-e^{-z}/2)\) term.
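The stated right-tail rate is easy to verify: far to the right, the log-density is approximately linear with slope \(-1/(2\sigma)\). A small sketch:

```python
import numpy as np
from scipy import stats

# Far in the right tail, log f(x) is approximately linear with slope -1/(2*sigma),
# because the exp(-z) term in the log-density is negligible there.
sigma = 1.5
x = np.array([40.0, 41.0])
logpdf = stats.moyal.logpdf(x, loc=0.0, scale=sigma)
slope = (logpdf[1] - logpdf[0]) / (x[1] - x[0])
print(slope, -1.0 / (2.0 * sigma))
```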

# Closed-form constants (standard Moyal)

skew_standard = 28.0 * np.sqrt(2.0) * special.zeta(3.0) / (np.pi**3)
excess_kurt_standard = 4.0

skew_standard, excess_kurt_standard
(1.5351415907229062, 4.0)
# Compare theoretical moments to SciPy's implementation

mu, sigma = 1.0, 0.8

mean_theory = mu + sigma * (np.euler_gamma + np.log(2.0))
var_theory = (np.pi**2 / 2.0) * sigma**2

mean_scipy, var_scipy, skew_scipy, excess_kurt_scipy = stats.moyal.stats(
    loc=mu, scale=sigma, moments="mvsk"
)

np.array([
    mean_theory, mean_scipy,
    var_theory, var_scipy,
    skew_standard, skew_scipy,
    excess_kurt_standard, excess_kurt_scipy,
])
array([2.0163, 2.0163, 3.1583, 3.1583, 1.5351, 1.5351, 4.    , 4.    ])

5) Parameter Interpretation#

The Moyal distribution is a location–scale family, so the parameters have very direct meanings:

  • Location \(\mu\) shifts the distribution left/right. In fact, \(\mu\) is the mode (the location of the peak).

  • Scale \(\sigma\) stretches the distribution. Variance grows like \(\sigma^2\):

\[\mathrm{Var}(X) = \frac{\pi^2}{2}\sigma^2.\]

A common pitfall is to treat \(\mu\) as a “mean parameter”. The mean is larger than the mode due to right skew:

\[\mathbb{E}[X] = \mu + \sigma(\gamma + \ln 2).\]

Because this is a location–scale family, skewness and (excess) kurtosis do not depend on \(\mu\) or \(\sigma\).
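A quick numerical confirmation via stats.moyal.stats (the same call used in the moment comparison above): the standardized shape moments come out identical across parameter settings:

```python
import numpy as np
from scipy import stats

# Skewness and excess kurtosis are the same for every (mu, sigma).
sk = [stats.moyal.stats(loc=m, scale=s, moments="sk")
      for (m, s) in [(-3.0, 0.2), (0.0, 1.0), (5.0, 7.0)]]
print(sk)
```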

# How (mu, sigma) change the shape

x = np.linspace(-6, 12, 800)

fig = go.Figure()
for mu_ in [-2.0, 0.0, 2.0]:
    fig.add_trace(
        go.Scatter(x=x, y=moyal_pdf(x, mu=mu_, sigma=1.0), mode="lines", name=f"mu={mu_}, sigma=1")
    )

fig.update_layout(
    title="Effect of location (sigma fixed)",
    xaxis_title="x",
    yaxis_title="pdf",
    template="plotly_white",
)
fig.show()
x = np.linspace(-6, 16, 900)

fig = go.Figure()
for sigma_ in [0.5, 1.0, 2.0]:
    fig.add_trace(
        go.Scatter(x=x, y=moyal_pdf(x, mu=0.0, sigma=sigma_), mode="lines", name=f"mu=0, sigma={sigma_}")
    )

fig.update_layout(
    title="Effect of scale (mu fixed)",
    xaxis_title="x",
    yaxis_title="pdf",
    template="plotly_white",
)
fig.show()

6) Derivations#

This section focuses on three derivations that you can reuse:

  1. a transformation that explains the closed-form CDF,

  2. a transformation that gives mean and variance quickly,

  3. the likelihood used for MLE and Bayesian inference.

6.1 CDF via a half-normal transformation#

For the standard Moyal \(Z\sim\mathrm{Moyal}(0,1)\), define

\[Y = \exp(-Z/2).\]

This map is one-to-one from \(\mathbb{R}\to(0,\infty)\) and is decreasing. Using change of variables with \(z=-2\log y\) and \(\left|\frac{dz}{dy}\right|=\frac{2}{y}\):

\[ f_Y(y) = f_Z(-2\log y)\,\frac{2}{y} = \frac{2}{\sqrt{2\pi}}e^{-y^2/2},\qquad y>0. \]

That is exactly the half-normal density, so \(Y\sim|\mathcal{N}(0,1)|\). Then, because \(Y\) is decreasing in \(Z\),

\[ F_Z(z)=\mathbb{P}(Z\le z)=\mathbb{P}(Y\ge e^{-z/2}). \]

For a half-normal, \(\mathbb{P}(Y\ge a)=\operatorname{erfc}(a/\sqrt{2})\), giving

\[ F_Z(z)=\operatorname{erfc}\!\left(\frac{e^{-z/2}}{\sqrt{2}}\right). \]
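This derivation can be checked end to end numerically, comparing SciPy's Moyal CDF, the erfc formula, and the half-normal survival probability:

```python
import numpy as np
from scipy import special, stats

# F_Z(z) = P(|N(0,1)| >= exp(-z/2)) = erfc(exp(-z/2)/sqrt(2))
z = np.linspace(-3.0, 8.0, 50)
via_erfc = special.erfc(np.exp(-z / 2.0) / np.sqrt(2.0))
via_halfnorm = stats.halfnorm.sf(np.exp(-z / 2.0))
diff = (np.max(np.abs(stats.moyal.cdf(z) - via_erfc))
        + np.max(np.abs(via_erfc - via_halfnorm)))
print(diff)  # all three routes should agree to floating-point precision
```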

6.2 Mean and variance via a chi-square transformation#

From the same identity, \(W=Y^2=\exp(-Z)\) has density

\[f_W(w)=\frac{1}{\sqrt{2\pi}}w^{-1/2}e^{-w/2},\qquad w>0,\]

so \(W\sim\chi^2_1\), i.e. \(W\sim\mathrm{Gamma}(\alpha=\tfrac12,\;\theta=2)\). But \(Z=-\log W\), so moments of \(Z\) are moments of \(\log W\).

For \(W\sim\mathrm{Gamma}(\alpha,\theta)\):

  • \(\mathbb{E}[\log W]=\psi(\alpha)+\log\theta\) (digamma),

  • \(\mathrm{Var}(\log W)=\psi_1(\alpha)\) (trigamma).

With \(\alpha=\tfrac12\) and \(\theta=2\):

\[ \mathbb{E}[Z] = -\mathbb{E}[\log W] = -\big(\psi(\tfrac12)+\log 2\big) = \gamma + \log 2, \]
\[ \mathrm{Var}(Z) = \mathrm{Var}(\log W)=\psi_1(\tfrac12)=\frac{\pi^2}{2}. \]

Finally, for \(X=\mu+\sigma Z\) we get

\[\mathbb{E}[X]=\mu+\sigma(\gamma+\log 2),\qquad \mathrm{Var}(X)=\sigma^2\frac{\pi^2}{2}.\]
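The digamma/trigamma values used above can be confirmed directly with scipy.special:

```python
import numpy as np
from scipy import special

# psi(1/2) = -gamma - 2 ln 2, so -(psi(1/2) + ln 2) = gamma + ln 2.
mean_z = -(special.digamma(0.5) + np.log(2.0))
# psi_1(1/2) = pi^2 / 2.
var_z = special.polygamma(1, 0.5)
print(mean_z, var_z)
```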

6.3 Likelihood#

For i.i.d. data \(x_1,\dots,x_n\), the likelihood is

\[ L(\mu,\sigma)=\prod_{i=1}^n f_X(x_i;\mu,\sigma). \]

Using \(z_i=(x_i-\mu)/\sigma\), the log-likelihood is

\[ \ell(\mu,\sigma) = -n\log\sigma - \frac{n}{2}\log(2\pi) -\frac12\sum_{i=1}^n\left(z_i + e^{-z_i}\right). \]

There is no closed-form MLE; in practice we maximize \(\ell\) numerically.

def moyal_nll_mu_logsigma(params: np.ndarray, x: np.ndarray) -> float:
    '''Negative log-likelihood with sigma parameterized as exp(logsigma).'''

    mu, logsigma = params
    sigma = float(np.exp(logsigma))
    return float(-np.sum(moyal_logpdf(x, mu=mu, sigma=sigma)))


# Synthetic data for likelihood / fitting demo
mu_true, sigma_true = 2.0, 1.1
x_fit = stats.moyal.rvs(loc=mu_true, scale=sigma_true, size=2000, random_state=rng)

# Simple initialization: median as a rough location estimate, scale from the variance identity
mu0 = float(np.median(x_fit))
sigma0 = float(np.sqrt(2.0) * np.std(x_fit) / np.pi)

res = optimize.minimize(
    moyal_nll_mu_logsigma,
    x0=np.array([mu0, np.log(sigma0)]),
    args=(x_fit,),
    method="L-BFGS-B",
)

mu_mle = float(res.x[0])
sigma_mle = float(np.exp(res.x[1]))

mu_fit_scipy, sigma_fit_scipy = stats.moyal.fit(x_fit)  # MLE

{
    'true': (mu_true, sigma_true),
    'mle_optimize': (mu_mle, sigma_mle),
    'mle_scipy_fit': (mu_fit_scipy, sigma_fit_scipy),
    'opt_success': bool(res.success),
}
{'true': (2.0, 1.1),
 'mle_optimize': (2.0154038243715915, 1.0690070365107665),
 'mle_scipy_fit': (2.0153779868371826, 1.0689827738192625),
 'opt_success': True}

7) Sampling & Simulation#

A convenient sampling identity follows directly from the half-normal transform. For \(Z\sim\mathrm{Moyal}(0,1)\) we showed that

\[Y=\exp(-Z/2)\sim |\mathcal{N}(0,1)|.\]

Solve for \(Z\):

\[Z = -2\log Y.\]

So to sample \(X\sim\mathrm{Moyal}(\mu,\sigma)\):

  1. Sample \(U\sim\mathcal{N}(0,1)\) using NumPy.

  2. Set \(Y=|U|\) (half-normal).

  3. Return \(X = \mu + \sigma\,(-2\log Y) = \mu - 2\sigma\log|U|\).

This uses only NumPy’s normal RNG plus a log and absolute value.

def moyal_rvs_numpy(mu: float, sigma: float, size: int, rng: np.random.Generator) -> np.ndarray:
    '''Sample Moyal(mu, sigma) using NumPy only.

    Uses: X = mu - 2*sigma*log|U|, U ~ Normal(0,1).
    '''

    mu = float(mu)
    sigma = _check_scale(sigma)

    u = rng.standard_normal(size)
    y = np.abs(u)

    # Avoid log(0) if u happens to underflow to exactly 0 (extremely rare but possible).
    y = np.maximum(y, np.finfo(float).tiny)

    z = -2.0 * np.log(y)
    return mu + sigma * z


mu_s, sigma_s = 1.5, 0.9
samples = moyal_rvs_numpy(mu_s, sigma_s, size=200_000, rng=rng)

mean_mc = samples.mean()
var_mc = samples.var()

mean_theory = mu_s + sigma_s * (np.euler_gamma + np.log(2.0))
var_theory = (np.pi**2 / 2.0) * sigma_s**2

{
    'mean_mc': float(mean_mc),
    'mean_theory': float(mean_theory),
    'var_mc': float(var_mc),
    'var_theory': float(var_theory),
}
{'mean_mc': 2.6485366779464337,
 'mean_theory': 2.6433265609153302,
 'var_mc': 4.024978622724565,
 'var_theory': 3.99718978244119}
# Quick goodness-of-fit check against SciPy's CDF (one-sample KS with known parameters)
ks = stats.kstest(samples[:20_000], 'moyal', args=(mu_s, sigma_s))
ks
KstestResult(statistic=0.006157959722548512, pvalue=0.43246486363740355, statistic_location=3.503244691922463, statistic_sign=-1)

8) Visualization#

We’ll visualize:

  • the theoretical PDF and CDF

  • Monte Carlo samples from the NumPy-only sampler

  • an empirical CDF overlay

mu_v, sigma_v = 0.0, 1.0
x = np.linspace(-6, 14, 900)

pdf = moyal_pdf(x, mu=mu_v, sigma=sigma_v)
cdf = moyal_cdf(x, mu=mu_v, sigma=sigma_v)

fig_pdf = go.Figure(go.Scatter(x=x, y=pdf, mode='lines', name='pdf'))
fig_pdf.update_layout(title='Moyal PDF', xaxis_title='x', yaxis_title='density', template='plotly_white')
fig_pdf.show()

fig_cdf = go.Figure(go.Scatter(x=x, y=cdf, mode='lines', name='cdf'))
fig_cdf.update_layout(title='Moyal CDF', xaxis_title='x', yaxis_title='F(x)', template='plotly_white')
fig_cdf.show()
# Monte Carlo samples + PDF overlay

mu_v, sigma_v = 0.3, 1.2
s = moyal_rvs_numpy(mu_v, sigma_v, size=80_000, rng=rng)

xgrid = np.linspace(np.quantile(s, 0.001), np.quantile(s, 0.999), 600)

fig = px.histogram(s, nbins=80, histnorm='probability density', title='Monte Carlo histogram (NumPy-only sampler)')
fig.add_trace(go.Scatter(x=xgrid, y=moyal_pdf(xgrid, mu=mu_v, sigma=sigma_v), mode='lines', name='theoretical pdf'))
fig.update_layout(xaxis_title='x', yaxis_title='density', template='plotly_white')
fig.show()
# Empirical CDF vs theoretical CDF

s_sorted = np.sort(s)
emp_cdf = np.arange(1, len(s_sorted) + 1) / len(s_sorted)

xgrid = np.linspace(np.quantile(s_sorted, 0.001), np.quantile(s_sorted, 0.999), 600)

fig = go.Figure()
fig.add_trace(go.Scatter(x=s_sorted[::200], y=emp_cdf[::200], mode='markers', name='empirical cdf'))
fig.add_trace(go.Scatter(x=xgrid, y=moyal_cdf(xgrid, mu=mu_v, sigma=sigma_v), mode='lines', name='theoretical cdf'))
fig.update_layout(title='CDF: empirical vs theoretical', xaxis_title='x', yaxis_title='F(x)', template='plotly_white')
fig.show()

9) SciPy Integration (scipy.stats.moyal)#

SciPy provides the Moyal distribution as a location–scale family:

  • stats.moyal.pdf(x, loc=mu, scale=sigma)

  • stats.moyal.cdf(x, loc=mu, scale=sigma)

  • stats.moyal.rvs(loc=mu, scale=sigma, size=..., random_state=...)

  • stats.moyal.fit(data) (MLE for loc, scale)

Let’s verify agreement with our NumPy/SciPy-special implementation.

mu, sigma = -0.7, 1.4
x = np.linspace(-8, 16, 800)

pdf_ours = moyal_pdf(x, mu=mu, sigma=sigma)
pdf_scipy = stats.moyal.pdf(x, loc=mu, scale=sigma)

cdf_ours = moyal_cdf(x, mu=mu, sigma=sigma)
cdf_scipy = stats.moyal.cdf(x, loc=mu, scale=sigma)

{
    'max_abs_pdf_diff': float(np.max(np.abs(pdf_ours - pdf_scipy))),
    'max_abs_cdf_diff': float(np.max(np.abs(cdf_ours - cdf_scipy))),
}
{'max_abs_pdf_diff': 5.551115123125783e-17, 'max_abs_cdf_diff': 0.0}
# Fit parameters from data

x_data = stats.moyal.rvs(loc=1.2, scale=0.9, size=1500, random_state=rng)

loc_hat, scale_hat = stats.moyal.fit(x_data)
loc_hat, scale_hat
(1.1690578312389472, 0.9084984164318166)

10) Statistical Use Cases#

10.1 Hypothesis testing / model checking#

Common tasks:

  • Goodness-of-fit: does a fitted Moyal model describe the data?

  • Model comparison: is Moyal a better fit than (say) a normal or lognormal?

We’ll use two practical tools:

  • a KS statistic as a quick diagnostic (with a warning about fitting),

  • AIC as a likelihood-based model comparison.

10.2 Bayesian modeling#

A simple Bayesian model treats \((\mu,\sigma)\) as unknown parameters:

\[x_i\mid\mu,\sigma \stackrel{\text{i.i.d.}}{\sim} \mathrm{Moyal}(\mu,\sigma).\]

There is no conjugate prior, but the log-likelihood is easy to compute, so MCMC works well. We’ll implement a small random-walk Metropolis sampler.

10.3 Generative modeling#

The latent-variable identity

\[X = \mu - 2\sigma\log|U|,\quad U\sim\mathcal{N}(0,1)\]

is already a generative model (with latent \(U\)). It also makes it easy to build mixtures like

\[X\sim \pi\,\mathrm{Moyal}(\mu_1,\sigma_1) + (1-\pi)\,\mathrm{Moyal}(\mu_2,\sigma_2).\]
# 10.1 Quick model comparison: Moyal vs Normal (AIC) + KS diagnostics

x = stats.moyal.rvs(loc=0.8, scale=0.7, size=600, random_state=rng)

# Fit both models by MLE
moyal_loc, moyal_scale = stats.moyal.fit(x)
normal_loc, normal_scale = stats.norm.fit(x)

ll_moyal = float(np.sum(stats.moyal.logpdf(x, loc=moyal_loc, scale=moyal_scale)))
ll_norm = float(np.sum(stats.norm.logpdf(x, loc=normal_loc, scale=normal_scale)))

# AIC = 2k - 2 log L (k=2 parameters for both models)
aic_moyal = 2 * 2 - 2 * ll_moyal
aic_norm = 2 * 2 - 2 * ll_norm

# KS test note: p-values are not exact when parameters are estimated from the same data.
ks_moyal = stats.kstest(x, 'moyal', args=(moyal_loc, moyal_scale))
ks_norm = stats.kstest(x, 'norm', args=(normal_loc, normal_scale))

{
    'moyal_fit': (moyal_loc, moyal_scale),
    'normal_fit': (normal_loc, normal_scale),
    'aic_moyal': aic_moyal,
    'aic_norm': aic_norm,
    'ks_moyal': (float(ks_moyal.statistic), float(ks_moyal.pvalue)),
    'ks_norm': (float(ks_norm.statistic), float(ks_norm.pvalue)),
}
{'moyal_fit': (0.7965922961909722, 0.7011334169300067),
 'normal_fit': (1.6947313365937005, 1.5827992986018258),
 'aic_moyal': 2049.215752722065,
 'aic_norm': 2257.7602247153422,
 'ks_moyal': (0.02229227363429831, 0.9201234117259488),
 'ks_norm': (0.11025845030503556, 8.288996193893122e-07)}
# 10.2 Bayesian modeling: random-walk Metropolis for (mu, sigma)

x = stats.moyal.rvs(loc=1.0, scale=0.9, size=300, random_state=rng)

# Priors: mu ~ Normal(0, 5^2), log(sigma) ~ Normal(0, 1^2)

def log_prior(mu: float, logsigma: float) -> float:
    return float(stats.norm.logpdf(mu, loc=0.0, scale=5.0) + stats.norm.logpdf(logsigma, loc=0.0, scale=1.0))


def log_likelihood(mu: float, logsigma: float, x: np.ndarray) -> float:
    sigma = float(np.exp(logsigma))
    return float(np.sum(moyal_logpdf(x, mu=mu, sigma=sigma)))


def log_posterior(mu: float, logsigma: float, x: np.ndarray) -> float:
    return log_prior(mu, logsigma) + log_likelihood(mu, logsigma, x)


n_steps = 12_000
burn = 2_000
step_mu = 0.08
step_logsigma = 0.06

mu_chain = np.empty(n_steps)
logsig_chain = np.empty(n_steps)

# Initialize at SciPy MLE (usually a decent starting point)
mu_init, sigma_init = stats.moyal.fit(x)
mu_curr = float(mu_init)
logsig_curr = float(np.log(sigma_init))
logp_curr = log_posterior(mu_curr, logsig_curr, x)

accept = 0
for t in range(n_steps):
    mu_prop = mu_curr + step_mu * rng.standard_normal()
    logsig_prop = logsig_curr + step_logsigma * rng.standard_normal()

    logp_prop = log_posterior(mu_prop, logsig_prop, x)

    if np.log(rng.random()) < (logp_prop - logp_curr):
        mu_curr, logsig_curr, logp_curr = mu_prop, logsig_prop, logp_prop
        accept += 1

    mu_chain[t] = mu_curr
    logsig_chain[t] = logsig_curr

accept_rate = accept / n_steps

mu_post = mu_chain[burn:]
sigma_post = np.exp(logsig_chain[burn:])

summary = {
    'accept_rate': accept_rate,
    'mu_mean': float(mu_post.mean()),
    'mu_ci95': (float(np.quantile(mu_post, 0.025)), float(np.quantile(mu_post, 0.975))),
    'sigma_mean': float(sigma_post.mean()),
    'sigma_ci95': (float(np.quantile(sigma_post, 0.025)), float(np.quantile(sigma_post, 0.975))),
}

summary
{'accept_rate': 0.47433333333333333,
 'mu_mean': 1.1168208745438195,
 'mu_ci95': (0.9563590687149044, 1.2839317728708906),
 'sigma_mean': 0.9271449297214939,
 'sigma_ci95': (0.8451908384406034, 1.0167339182440416)}
# Posterior visualization (marginals)

fig_mu = px.histogram(mu_post, nbins=60, title='Posterior of mu', histnorm='probability density')
fig_mu.update_layout(template='plotly_white', xaxis_title='mu')
fig_mu.show()

fig_sigma = px.histogram(sigma_post, nbins=60, title='Posterior of sigma', histnorm='probability density')
fig_sigma.update_layout(template='plotly_white', xaxis_title='sigma')
fig_sigma.show()
# 10.3 Generative modeling: a simple two-component Moyal mixture

n = 60_000
pi = 0.65

params1 = (0.0, 0.8)
params2 = (3.5, 1.1)

component = rng.random(n) < pi
x_mix = np.empty(n)

x_mix[component] = moyal_rvs_numpy(*params1, size=int(component.sum()), rng=rng)
x_mix[~component] = moyal_rvs_numpy(*params2, size=int((~component).sum()), rng=rng)

fig = px.histogram(x_mix, nbins=120, title='Mixture of two Moyal components', histnorm='probability density')
fig.update_layout(template='plotly_white', xaxis_title='x', yaxis_title='density')
fig.show()

11) Pitfalls#

  • Invalid parameters: the scale must satisfy \(\sigma>0\).

  • Interpretation: \(\mu\) is the mode, not the mean.

  • MGF domain: the MGF exists only for \(t < 1/(2\sigma)\) because the right tail is exponential.

  • Numerical issues: direct pdf computation can underflow/overflow for extreme inputs; prefer logpdf and then exponentiate if needed.

  • Goodness-of-fit tests after fitting: KS p-values are not exact when parameters are estimated from the same sample (use as a diagnostic, not a proof).

  • Outliers: the long right tail can make MLE fits sensitive to rare large values; robust alternatives (trimmed fits, mixtures, or heavy-tail modeling) may be more appropriate.
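To illustrate the underflow pitfall concretely (standard Moyal; the direct log-density below mirrors the moyal_logpdf helper defined earlier):

```python
import numpy as np
from scipy import stats

LOG_SQRT_2PI = 0.5 * np.log(2.0 * np.pi)

# Far in the right tail the density drops below the smallest representable double,
# so the direct pdf underflows to 0.0 while the log-density stays finite.
x = 1600.0
pdf_val = stats.moyal.pdf(x)                       # underflows to 0.0
log_val = -LOG_SQRT_2PI - 0.5 * (x + np.exp(-x))   # finite, usable in a log-likelihood
print(pdf_val, log_val)
```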

12) Summary#

  • moyal is a continuous location–scale distribution on \(\mathbb{R}\) with a sharp peak and a long exponential right tail.

  • PDF: \(\displaystyle f(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\tfrac12\left(z+e^{-z}\right)\right]\) where \(z=(x-\mu)/\sigma\).

  • CDF: \(\displaystyle F(x)=\operatorname{erfc}\!\left(\frac{e^{-z/2}}{\sqrt{2}}\right)\).

  • Mean/variance: \(\mu+\sigma(\gamma+\ln 2)\) and \(\frac{\pi^2}{2}\sigma^2\); skewness and excess kurtosis are constants.

  • Sampling is simple and NumPy-only via the identity \(X=\mu-2\sigma\log|U|\) with \(U\sim\mathcal{N}(0,1)\).

  • SciPy support: scipy.stats.moyal provides pdf, cdf, rvs, and fit.